Abstract

A covariance graph is an undirected graph associated with a multivariate probability
distribution of a given random vector, where each vertex represents one component of the
random vector and where the absence of an edge between a pair of variables implies marginal
independence between these two variables. Covariance graph models have recently received
much attention in the literature and constitute a sub-family of graphical models. Though
they are conceptually simple to understand, they are considerably more difficult to analyze.
Under suitable assumptions on the probability distribution, covariance graph models can also
be used to represent more complex conditional independence relationships between subsets of
variables. When the covariance graph captures or reflects all the conditional independence
statements present in the probability distribution, the latter is said to be faithful to its
covariance graph, though no such prior guarantee exists. Despite the increasingly widespread
use of these graphical models, to date no deep probabilistic analysis of this class of
models, in terms of the faithfulness assumption, is available. Such an analysis is crucial
in understanding the ability of the graph, a discrete object, to fully capture the salient
features of the probability distribution it aims to describe. In this paper we demonstrate
that multivariate Gaussian distributions that have trees as covariance graphs are
necessarily faithful. The method of proof uses an entirely new approach and in the process
yields a technique that is novel to the field of graphical models.

1 Introduction 

Markov random fields or graphical models are widely used to represent conditional
independences in a given multivariate probability distribution (see Kunsch et al. (1995),
Ji & Seymour (1996), Spitzer (1975), Kindermann & Snell (1980), Lauritzen (1996), to name
just a few). Many different types of Markov random fields or graphical models have been
studied in the literature. For example, directed acyclic graphs or DAGs are commonly
referred to as "Bayesian networks" (see Pearl (1988)). When the graph is undirected and is
constructed using marginal independence relationships between pairs of random variables in a
given random vector, these graphical models are called "covariance graph" models (see Cox &
Wermuth (1993), Cox & Wermuth (1996), Kauermann (1996), Malouche & Rajaratnam (2009) and
Khare & Rajaratnam (2009)). Covariance graph models are commonly represented by graphs with
exclusively bi-directed or dashed edges (see Kauermann (1996)). This representation is used
in order to distinguish them from the traditional and widely used concentration graph
models. Concentration graphs encode conditional independence between pairs of variables
given the remaining ones. Formally, consider a random vector $X = (X_v, v \in V)'$ with
probability distribution $P$, where $V$ is a finite set indexing the random variables in
$X$. The concentration graph associated with $P$ is an undirected graph $G = (V, E)$ where

• V is the set of vertices. 

• Each vertex represents one variable in X. 

• E is the set of edges (between the vertices in V) constructed using the pairwise rule: for
any pair $(u, v) \in V \times V$, $u \neq v$,

$$(u, v) \notin E \iff X_u \perp\!\!\!\perp X_v \mid X_{V \setminus \{u, v\}} \qquad (1)$$

where $X_{V \setminus \{u, v\}} := (X_w, \ w \neq u \text{ and } w \neq v)'$.

Note that $(u, v) \notin E$ means that the vertices u and v are not adjacent in G.

An undirected graph $G_0 = (V, E_0)$ is called the covariance graph associated with the
probability distribution $P$ if the set of edges $E_0$ is constructed as follows:

$$(u, v) \notin E_0 \iff X_u \perp\!\!\!\perp X_v \qquad (2)$$

The subscript zero is used for covariance graphs (i.e., $G_0$ vs. $G$) as the definition of
covariance graphs does not involve conditional independences.

Both concentration and covariance graphs are not only used to encode pairwise relationships 
between pairs of variables in the random vector X, but as we will see below, these graphs can be 
used to encode conditional independences that exist between subsets of variables of X. First we 
introduce some definitions: 

The multivariate distribution P is said to satisfy the "intersection property" if for any
subsets A, B, C and D of V which are pairwise disjoint,

$$\text{if } X_A \perp\!\!\!\perp X_B \mid X_{C \cup D} \ \text{ and } \ X_A \perp\!\!\!\perp X_C \mid X_{B \cup D}, \ \text{ then } \ X_A \perp\!\!\!\perp X_{B \cup C} \mid X_D. \qquad (3)$$

We will call the intersection property (see Lauritzen (1996)) in (3) above the concentration
intersection property in this paper in order to differentiate it from another property that
is satisfied by P when studying covariance graph models.

Let P satisfy the concentration intersection property. Then for any triplet (A, B, S) of
pairwise disjoint subsets of V, if S separates¹ A and B in the concentration graph G
associated with P, then the random vector $X_A = (X_v, v \in A)'$ is independent of
$X_B = (X_v, v \in B)'$ given $X_S = (X_v, v \in S)'$. This latter property is called the
concentration global Markov property and is formally defined as

$$A \perp_G B \mid S \implies X_A \perp\!\!\!\perp X_B \mid X_S. \qquad (4)$$

¹ We say that S separates A and B if any path connecting A and B in G intersects S, written
$A \perp_G B \mid S$. This is not to be confused with stochastic independence, which is
denoted by $\perp\!\!\!\perp$ rather than $\perp_G$.

Kauermann (1996) and Banerjee & Richardson (2003) show that if P satisfies the following
property: for any triplet (A, B, C) of pairwise disjoint subsets of V,

$$\text{if } X_A \perp\!\!\!\perp X_B \ \text{ and } \ X_A \perp\!\!\!\perp X_C, \ \text{ then } \ X_A \perp\!\!\!\perp X_{B \cup C}, \qquad (5)$$

then for any triplet (A, B, S) of pairwise disjoint subsets of V, if
$V \setminus (A \cup B \cup S)$ separates A and B in the covariance graph $G_0$ associated
with P, then $X_A \perp\!\!\!\perp X_B \mid X_S$. This latter property is called the
covariance global Markov property and can be written formally as follows:

$$A \perp_{G_0} B \mid V \setminus (A \cup B \cup S) \implies X_A \perp\!\!\!\perp X_B \mid X_S. \qquad (6)$$

In parallel to the concentration graph case, property (5) will be called the covariance
intersection property.

Even if P satisfies both intersection properties, the covariance and concentration graphs
may not be able to capture or reflect all the conditional independences present in the
distribution, i.e., there may exist one or more conditional independences present in the
probability distribution that do not correspond to any separation statement in either $G$ or
$G_0$. Equivalently, the lack of a separation statement in the graph does not necessarily
imply the lack of the corresponding conditional independence. In the contrary case, when no
conditional independences exist in P other than the ones encoded by the graph, we say that P
is faithful to its graphical model. More precisely, we say that P is concentration faithful
to its concentration graph $G$ if for any triplet (A, B, S) of pairwise disjoint subsets of
V, the following statement holds:

$$S \text{ separates } A \text{ and } B \text{ in } G \iff X_A \perp\!\!\!\perp X_B \mid X_S. \qquad (7)$$

Similarly, P is said to be covariance faithful to its covariance graph $G_0$ if for any
triplet (A, B, S) of pairwise disjoint subsets of V, the following statement holds:

$$V \setminus (A \cup B \cup S) \text{ separates } A \text{ and } B \text{ in } G_0 \iff X_A \perp\!\!\!\perp X_B \mid X_S. \qquad (8)$$

A natural question of both theoretical and applied interest in probability theory is to
understand the implications of the faithfulness assumption. This assumption is fundamental
since it yields a bijection between the probability distribution P and the graph G in terms
of the independences that are present in the distribution. In this paper we show that
multivariate Gaussian distributions whose covariance graphs are trees are necessarily
covariance faithful, i.e., these probability distributions satisfy property (8): the
associated covariance graph $G_0$ is fully able to capture all the conditional independences
present in the multivariate distribution P. This result can be considered as a dual of a
previous probabilistic result proved by Becker et al. (2005) for concentration graphs, which
demonstrates that Gaussian distributions having concentration trees (i.e., whose
concentration graph is a tree) are necessarily concentration faithful to their concentration
graphs (implying that property (7) is satisfied). That result was proved by showing that
Gaussian distributions satisfy an additional intersection property. The approach in the
proof of the main result of this paper is vastly different from the one used for
concentration graphs by Becker et al. (2005).



The outline of this paper is as follows. Section 2 presents graph theory preliminaries.
Section 3 gives a brief overview of covariance and concentration graphs associated with
multivariate Gaussian distributions. Furthermore, an easier way to encode conditional
independence using covariance graphs is given in Section 3. The proof of the main result of
this paper is given in Section 4. Section 5 concludes by summarizing the results in the
paper and the implications thereof.

2 Graph theory preliminaries 

This section introduces notation and terminology that is required in subsequent sections. An
undirected graph G = (V, E) consists of two sets V and E, with V representing the set of
vertices, and $E \subseteq (V \times V) \setminus \{(u, u), u \in V\}$ the set of edges
satisfying:

$$(u, v) \in E \iff (v, u) \in E.$$

For $u, v \in V$, we write $u \sim_G v$ when $(u, v) \in E$ and we say that u and v are
adjacent in G.

Definition 1 A path connecting two distinct vertices u and v in G is a sequence of distinct
vertices $(u_0, u_1, \ldots, u_n)$ where $u_0 = u$ and $u_n = v$ and where for every
$i = 0, \ldots, n - 1$, $u_i \sim_G u_{i+1}$.

Such a path will be denoted $p = p(u, v, G)$ and we say that $p(u, v, G)$ connects u and v
or alternatively that u and v are connected by $p(u, v, G)$. Its length, denoted by
$|p(u, v, G)|$, is defined as the number of edges connecting the vertices of p. So, in this
case, $|p(u, v, G)| = n$. We also denote by $\mathcal{P}(u, v, G)$ the set of paths between
u and v.

Trees are a particular class of graphs that are studied in this paper. This class of graphs
is formally defined below.

Definition 2 Let G = (V, E) be an undirected graph. The graph G is called a tree if any pair
of distinct vertices (u, v) in G is connected by exactly one path, i.e.,
$|\mathcal{P}(u, v, G)| = 1$ for all $u, v \in V$, $u \neq v$.

A subgraph of G induced by a subset $U \subseteq V$ is denoted by $G_U = (U, E_U)$, where
$E_U = E \cap (U \times U)$.

Definition 3 A connected component of a graph G is the largest subgraph $G_U = (U, E_U)$ of
G such that each pair of vertices can be connected by at least one path in $G_U$.

We now state a Lemma needed in the proof of the main result of this paper. 

Lemma 1 Let G = (V, E) be an undirected graph. If G is a tree, any subgraph of G induced by
a subset of V is a union of connected components, each of which is a tree (or what we shall
refer to as a "union of tree connected components").



Proof. Consider $U \subseteq V$, the induced graph $G_U$ and a pair of vertices
$(u, v) \in U \times U$. Let us assume to the contrary that u and v are connected by two
distinct paths $p_1$ and $p_2$ in $G_U$ (i.e., $G_U$ is not a union of trees). As the set of
edges $E_U$ of the graph $G_U$ is included in the set of edges E of G, i.e.,
$E_U = E \cap (U \times U) \subseteq E$, $p_1$ and $p_2$ are also paths in G. Hence u and v
are vertices in G which are connected by two distinct paths, namely $p_1$ and $p_2$. This
contradicts the fact that G is a tree. Thus any pair of vertices in $G_U$ is connected by at
most one path and, hence, $G_U$ is a union of connected components, each of which is a tree
(or a "union of tree connected components"). ■
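Lemma 1 can be checked mechanically on a small example. The sketch below is illustrative only: the edge list `TREE` is a hypothetical 8-vertex tree chosen for this sketch, and the helper names are assumptions rather than notation from the paper. It relies on the standard fact that a connected graph is a tree exactly when it has one fewer edge than vertices.

```python
from collections import deque

def induced_subgraph(edges, subset):
    """Edges with both endpoints in `subset`, i.e. E_U = E ∩ (U × U)."""
    return {(u, v) for (u, v) in edges if u in subset and v in subset}

def components(vertices, edges):
    """Connected components as a list of (vertex set, edge set) pairs."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, result = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x] - comp:
                comp.add(y)
                queue.append(y)
        seen |= comp
        result.append((comp, induced_subgraph(edges, comp)))
    return result

def is_forest(vertices, edges):
    """Every component must satisfy |E| = |V| - 1 (connected + acyclic)."""
    return all(len(e) == len(c) - 1 for c, e in components(vertices, edges))

# Hypothetical tree used only for illustration.
TREE = {(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (5, 7), (7, 8)}
U = {2, 3, 7, 8}
E_U = induced_subgraph(TREE, U)

print(sorted(E_U))
print([sorted(c) for c, _ in components(U, E_U)])
print(is_forest(U, E_U))
```

Here the induced subgraph on `U` splits into two components, each a single edge, which is the "union of tree connected components" situation described in the lemma.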

Definition 4 For a connected graph, a separator is a subset S of V such that there exists a
pair of non-adjacent vertices u and v such that $u, v \notin S$ and

$$\forall p \in \mathcal{P}(u, v, G), \quad p \cap S \neq \emptyset. \qquad (9)$$

If S is a separator then it is easily verified that every $S' \supseteq S$ such that
$S' \subseteq V \setminus \{u, v\}$ is also a separator. We are thus led to the notion of a
minimal separator.

Definition 5 The separator S is defined to be a minimal separator between two non-adjacent
vertices u and v if for any $w \in S$, the subset $S \setminus \{w\}$ is not a separator of
u and v.

Note that in the case where G contains at least two connected components and u and v belong
to different connected components, the empty set is the only minimal separator of u and v.
Finally, let A, B and S be pairwise disjoint subsets of V. We say that S separates A and B
if for any pair of vertices $(u, v) \in A \times B$, any path connecting u and v intersects
S. In the case where A and B belong to different connected components of G, the subset S can
be empty because the set of paths between any pair of vertices $(u, v) \in A \times B$ is
empty.
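The separation criterion just defined amounts to a reachability test: delete S from the graph and ask whether any vertex of B can still be reached from A. The following is a minimal sketch under that reading; the function name `separates` and the example edge lists are assumptions made for illustration.

```python
from collections import deque

def separates(edges, A, B, S):
    """True if every path from a vertex of A to a vertex of B meets S."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    blocked = set(S)
    reached = set(A) - blocked
    queue = deque(reached)
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            # Never step onto a vertex of S; stop revisiting known vertices.
            if y not in blocked and y not in reached:
                reached.add(y)
                queue.append(y)
    return not (reached & set(B))

# Hypothetical 8-vertex tree used only for illustration.
TREE = {(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (5, 7), (7, 8)}

print(separates(TREE, {1, 2}, {5}, {4, 6}))   # S misses the path 2-3-5
print(separates(TREE, {1, 2}, {5}, {3}))      # vertex 3 lies on every path
# Two components: the empty set already separates vertices in different ones.
print(separates({(1, 2), (3, 4)}, {1}, {3}, set()))
```

A breadth-first search is used here simply because it is short; any graph traversal would serve equally well.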

3 Gaussian Concentration and Covariance Graphs 

In this section we present a brief overview of concentration and covariance graphs in the
case when the probability distribution P is multivariate Gaussian. Such graphical models are
commonly referred to as Gaussian covariance or Gaussian concentration graph models.

3.1 Gaussian concentration graph models 

Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and let
$X : \Omega \to \mathbb{R}^{|V|}$ be a random vector where $X = (X_v, v \in V)'$ and $P$
represents the measure induced by $X$ from $\mathbb{P}$. If X follows a Gaussian
distribution then it has the following density function with respect to Lebesgue measure:

$$f(x) = (2\pi)^{-|V|/2} \, |\Sigma|^{-1/2} \exp\left( -\tfrac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right), \qquad (10)$$

where $x = (x_u, u \in V)' \in \mathbb{R}^{|V|}$, $\mu \in \mathbb{R}^{|V|}$ is the mean
vector and $\Sigma = (\sigma_{uv}) \in \mathcal{P}^+$ is the covariance matrix, with
$\mathcal{P}^+$ denoting the cone of symmetric positive definite matrices. Without loss of
generality we will assume that $\mu = 0$. As any Gaussian distribution with $\mu = 0$ is
completely determined by its covariance matrix $\Sigma$, this set of multivariate Gaussian
distributions can therefore be identified with the set of symmetric positive definite
matrices. Gaussian distributions can also be parameterized by the inverse of the covariance
matrix $\Sigma$, denoted by $K = \Sigma^{-1} = (k_{uv})$. The matrix K is called the
precision or concentration matrix. It is well known (see Lauritzen (1996)) that for any pair
of variables $(X_u, X_v)$, where $u \neq v$,

$$X_u \perp\!\!\!\perp X_v \mid X_{V \setminus \{u, v\}} \iff k_{uv} = 0.$$

Hence the concentration graph G = (V, E) can be constructed simply using the precision
matrix K and the following rule:

$$(u, v) \notin E \iff k_{uv} = 0.$$

Furthermore, it can be easily deduced from a classical result in Hammersley & Clifford
(1971), which is reproved in Lauritzen (1996), that any multivariate random vector with a
positive density necessarily satisfies the concentration intersection property (3). Hence
for Gaussian concentration graph models the pairwise Markov property in (1) is equivalent to
the concentration global Markov property in (4).

3.2 Gaussian covariance graph models 

As seen earlier in (2), covariance graphs are constructed using pairwise marginal
independence relationships. It is also well known that for multivariate Gaussian
distributions:

$$X_u \perp\!\!\!\perp X_v \iff \sigma_{uv} = 0.$$

Hence in the Gaussian case the covariance graph $G_0 = (V, E_0)$ can be constructed using
the following rule:

$$(u, v) \notin E_0 \iff \sigma_{uv} = 0.$$
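Both construction rules can be read directly off a matrix: the covariance graph from the zero pattern of $\Sigma$ and the concentration graph from the zero pattern of $K = \Sigma^{-1}$. The sketch below illustrates this with exact rational arithmetic. The 3 x 3 matrix `SIGMA` and the helper names are assumptions chosen for illustration, not objects from the paper.

```python
from fractions import Fraction as F

def det(m):
    """Exact determinant by Gaussian elimination (empty matrix -> 1)."""
    a = [list(row) for row in m]
    d = F(1)
    for c in range(len(a)):
        piv = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, len(a)):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Exact inverse via the adjugate: K[i][j] = cofactor(j, i) / det."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j)
             * det([[m[r][c] for c in range(n) if c != i]
                    for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

def graph(m):
    """Edge set {(i, j), i < j} given by the nonzero off-diagonal entries."""
    n = len(m)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if m[i][j] != 0}

# Hypothetical covariance matrix with sigma_02 = 0 (variables 0 and 2
# are marginally independent).
SIGMA = [[F(1), F(1, 2), F(0)],
         [F(1, 2), F(1), F(1, 2)],
         [F(0), F(1, 2), F(1)]]
K = inverse(SIGMA)

covariance_edges = graph(SIGMA)       # rule: (u, v) not in E_0 iff sigma_uv = 0
concentration_edges = graph(K)        # rule: (u, v) not in E   iff k_uv = 0
print(sorted(covariance_edges), sorted(concentration_edges))
```

In this example the covariance graph is the path 0 - 1 - 2, while every off-diagonal entry of `K` is nonzero, so the concentration graph is complete. Exact fractions are used so that "equal to zero" is a reliable test rather than a floating-point tolerance.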

It is also easily seen that Gaussian distributions satisfy the covariance intersection
property defined in (5). Hence Gaussian covariance graphs can also encode conditional
independences according to the following rule: for any triplet (A, B, S) of pairwise
disjoint subsets of V, if $V \setminus (A \cup B \cup S)$ separates A and B in the
covariance graph $G_0$, then $X_A \perp\!\!\!\perp X_B \mid X_S$. We now show (see
Proposition 2 below) that there is a simple way to read conditional independence statements
from the covariance graph. This result holds true for any probability distribution that
satisfies the covariance intersection property given in (5).

Proposition 2 Let $X_V = (X_v, v \in V)'$ be a random vector with probability distribution P
satisfying the covariance intersection property in (5) and let $G_0 = (V, E_0)$ be the
covariance graph associated with P. Then the following statements are equivalent:

i. for any pairwise disjoint subsets A, B and S of V: if $V \setminus (A \cup B \cup S)$
separates A and B in $G_0$, then $X_A \perp\!\!\!\perp X_B \mid X_S$;

ii. for any pairwise disjoint subsets A, B and S of V: if S separates A and B in $G_0$, then
$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S)}$.

Proof. Let us first assume that (i) is satisfied and let us prove (ii).

Let A, B and S be three pairwise disjoint subsets of V such that S separates A and B in
$G_0$. Note that we can write S as follows:

$$S = V \setminus \big( (V \setminus (A \cup B \cup S)) \cup A \cup B \big),$$

since $(V \setminus (A \cup B \cup S)) \cup A \cup B = V \setminus S$ and
$V \setminus (V \setminus S) = S$.

By hypothesis S separates A and B in $G_0$. Let $S' = V \setminus (A \cup B \cup S)$; since
$S = V \setminus (S' \cup A \cup B)$, we can apply property (i) to the triplet (A, B, S').
Hence $X_A \perp\!\!\!\perp X_B \mid X_{S'}$, that is,
$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (S \cup A \cup B)}$, since
$S' := V \setminus (S \cup A \cup B)$. We have therefore proved that if S separates A and B
in $G_0$, then $X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (S \cup A \cup B)}$.

Assume now that property (ii) is satisfied and let A, B and S be three pairwise disjoint
subsets of V such that $V \setminus (S \cup A \cup B)$ separates A and B in $G_0$. Let us
denote by $S' = V \setminus (S \cup A \cup B)$, which is a subset separating A and B in
$G_0$. Since (ii) is satisfied, we deduce that
$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S')}$. However,

$$V \setminus (A \cup B \cup S') = V \setminus \big( (V \setminus (A \cup B \cup S)) \cup A \cup B \big) = S.$$

Hence we conclude that if $V \setminus (A \cup B \cup S)$ separates A and B in $G_0$, then
$X_A \perp\!\!\!\perp X_B \mid X_S$. Thus property (i) is satisfied. ■

Proposition 2 can be used to formulate an equivalent definition of the covariance faithfulness 
property. 

Definition 6 Let $X_V = (X_v, v \in V)'$ be a random vector with probability distribution P
satisfying the covariance intersection property in (5) and let $G_0 = (V, E_0)$ be the
covariance graph associated with P. We say that P is covariance faithful to $G_0$ if for any
pairwise disjoint subsets A, B and S of V the following condition is satisfied:

$$S \text{ separates } A \text{ and } B \text{ in } G_0 \iff X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S)}.$$

The above reformulation of the covariance faithfulness property is an important ingredient in the 
proofs in the next section. 

4 Gaussian Covariance faithful trees 

We now proceed to study the faithfulness assumption in the context of multivariate Gaussian 
distributions and when the associated covariance graphs are trees. 

The main result of this paper, presented in Theorem 3, proves that multivariate Gaussian
probability distributions having tree covariance graphs are necessarily faithful to their
covariance graphs. The analogous result for concentration graphs was demonstrated by Becker
et al. (2005), where the authors proved that Gaussian distributions having tree
concentration graphs are necessarily faithful to these graphs. We now formally state
Theorem 3. The proof follows shortly after a series of lemmas and theorems and an
illustrative example.

Theorem 3 Let $X_V = (X_v, v \in V)'$ be a random vector with Gaussian distribution
$P = \mathcal{N}_{|V|}(\mu, \Sigma)$. Let $G_0 = (V, E_0)$ be the covariance graph
associated with P. If $G_0$ is a tree or, more generally, a union of connected components
each of which is a tree (or a union of "tree connected components"), then P is covariance
faithful to $G_0$.

The proof of Theorem 3 requires, among others, a result proved by Jones & West (2005). This
result gives a method that can be used to compute the covariance matrix $\Sigma$ from the
precision matrix K using the paths in the concentration graph G. The result can also be
easily extended to show that the precision matrix K can be computed from the covariance
matrix $\Sigma$ using the paths in the covariance graph $G_0$. We now state the result by
Jones & West (2005).

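Theorem 4 (Jones & West (2005)).

Let $X_V = (X_v, v \in V)'$ be a random vector with Gaussian distribution
$P = \mathcal{N}_{|V|}(\mu, \Sigma)$, where $\Sigma$ and $K = \Sigma^{-1}$ are positive
definite matrices. Let $G = (V, E)$ and $G_0 = (V, E_0)$ denote respectively the
concentration and covariance graphs associated with the probability distribution of $X_V$.

For all $(u, v) \in V \times V$,

$$\sigma_{uv} = \sum_{p \in \mathcal{P}(u, v, G)} (-1)^{|p|+1} \, k_{u_0 u_1} k_{u_1 u_2} \cdots k_{u_{n-1} u_n} \, \frac{|K \setminus p|}{|K|}$$

and

$$k_{uv} = \sum_{p \in \mathcal{P}(u, v, G_0)} (-1)^{|p|+1} \, \sigma_{u_0 u_1} \sigma_{u_1 u_2} \cdots \sigma_{u_{n-1} u_n} \, \frac{|\Sigma \setminus p|}{|\Sigma|},$$

where, if $p = (u_0, \ldots, u_n)$,
$K \setminus p = (k_{uv}, \ (u, v) \in (V \setminus p) \times (V \setminus p))$ and
$\Sigma \setminus p = (\sigma_{uv}, \ (u, v) \in (V \setminus p) \times (V \setminus p))$
denote respectively K and $\Sigma$ with rows and columns corresponding to variables in path
p omitted. The determinant of a zero-dimensional matrix is defined to be 1.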
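The path expansion of $k_{uv}$ over the covariance graph can be verified numerically on a small case. The sketch below does this for a hypothetical 4-cycle covariance matrix, so that two paths contribute to each entry. One caution about signs: the sketch takes the sign of each term to be $(-1)^{\text{number of edges of } p}$, which is what the 2 x 2 case forces ($k_{12} = -\sigma_{12}/|\Sigma|$) and which coincides with an exponent of "number of vertices plus one". If $|p|$ in the exponent is instead read as the number of edges, every term changes sign simultaneously, which has no effect on any non-vanishing argument. All names and the matrix are assumptions of this sketch.

```python
from fractions import Fraction as F

def det(m):
    """Exact determinant by Gaussian elimination (empty matrix -> 1)."""
    a = [list(row) for row in m]
    d = F(1)
    for c in range(len(a)):
        piv = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, len(a)):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Exact inverse via the adjugate: K[i][j] = cofactor(j, i) / det."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j)
             * det([[m[r][c] for c in range(n) if c != i]
                    for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

def simple_paths(adj, u, v, path=None):
    """Yield every path of distinct vertices from u to v."""
    path = [u] if path is None else path
    if u == v:
        yield list(path)
        return
    for w in adj[u]:
        if w not in path:
            path.append(w)
            yield from simple_paths(adj, w, v, path)
            path.pop()

def path_sum(sigma, u, v):
    """Sum over paths in the covariance graph, as in the second formula."""
    n = len(sigma)
    adj = {i: [j for j in range(n) if j != i and sigma[i][j] != 0]
           for i in range(n)}
    total = F(0)
    for p in simple_paths(adj, u, v):
        prod = F(1)
        for a, b in zip(p, p[1:]):
            prod *= sigma[a][b]
        rest = [i for i in range(n) if i not in p]
        minor = det([[sigma[i][j] for j in rest] for i in rest])
        total += (-1) ** (len(p) - 1) * prod * minor   # len(p) - 1 edges
    return total / det(sigma)

q = F(1, 4)
# Hypothetical covariance matrix whose covariance graph is the cycle 0-1-2-3-0.
SIGMA = [[1, q, 0, q],
         [q, 1, q, 0],
         [0, q, 1, q],
         [q, 0, q, 1]]
SIGMA = [[F(x) for x in row] for row in SIGMA]
K = inverse(SIGMA)
agree = all(path_sum(SIGMA, u, v) == K[u][v]
            for u in range(4) for v in range(4) if u != v)
print(agree)
```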

The proof of our main theorem (Theorem 3) also requires the results proved in the lemma 
below. 

Lemma 5 Let $X_V = (X_v, v \in V)'$ be a random vector with Gaussian distribution
$P = \mathcal{N}_{|V|}(\mu, \Sigma = K^{-1})$. Let $G_0 = (V, E_0)$ and $G = (V, E)$ denote
respectively the covariance and concentration graphs associated with P. Then

i. $G$ and $G_0$ have the same connected components;

ii. if a given connected component in $G_0$ is a tree then the corresponding connected
component in $G$ is complete, and vice versa.

Proof.

Proof of (i).

The fact that $G_0$ and $G$ have the same connected components can be deduced from the
matrix structure of the covariance and the precision matrix. The connected components of
$G_0$ correspond to diagonal blocks in $\Sigma$. Since $K = \Sigma^{-1}$, by properties of
inverting partitioned matrices, K also has the same block diagonal structure as $\Sigma$ in
terms of the variables that constitute these blocks. These blocks correspond to distinct
components in $G$ and $G_0$. Hence both graphs have the same connected components.

Proof of (ii).

Let us assume now that the covariance graph $G_0$ is a tree, hence it is a connected graph
with only one connected component. We shall prove that the concentration graph $G$ is
complete by using Theorem 4 by Jones & West (2005) and computing any coefficient $k_{uv}$
($u \neq v$). Since $G_0$ is a tree, there exists exactly one path between any two vertices
u and v. We shall denote this path as $p = (u_0 = u, \ldots, u_n = v)$. Then by Theorem 4,

$$k_{uv} = (-1)^{n+1} \, \sigma_{u_0 u_1} \cdots \sigma_{u_{n-1} u_n} \, \frac{|\Sigma \setminus p|}{|\Sigma|}. \qquad (11)$$

First note that the determinants of the matrices in (11) are all positive since principal
minors of positive definite matrices are positive. Second, since we are considering a path
in $G_0$, $\sigma_{u_{i-1} u_i} \neq 0$ for all $i = 1, \ldots, n$. Using these two facts we
deduce from (11) that $k_{uv} \neq 0$ for all pairs of distinct vertices
$(u, v) \in V \times V$. Hence any two distinct vertices u and v are adjacent in $G$. The
concentration graph $G$ is therefore complete. The proof that $G_0$ is complete when $G$ is
assumed to be a tree follows similarly.
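Both parts of Lemma 5 can be observed on a small block-diagonal example. In the sketch below, the covariance graph has two tree components, the path 0 - 1 - 2 and the single edge 3 - 4. The matrix entries and helper names are hypothetical choices made for illustration.

```python
from fractions import Fraction as F

def det(m):
    """Exact determinant by Gaussian elimination (empty matrix -> 1)."""
    a = [list(row) for row in m]
    d = F(1)
    for c in range(len(a)):
        piv = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, len(a)):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Exact inverse via the adjugate: K[i][j] = cofactor(j, i) / det."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j)
             * det([[m[r][c] for c in range(n) if c != i]
                    for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

def graph(m):
    """Edge set {(i, j), i < j} given by the nonzero off-diagonal entries."""
    n = len(m)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if m[i][j] != 0}

def components(n, edge_set):
    """Connected components (as frozensets) of a graph on vertices 0..n-1."""
    comp = {v: {v} for v in range(n)}
    for u, v in edge_set:
        if comp[u] is not comp[v]:
            merged = comp[u] | comp[v]
            for w in merged:
                comp[w] = merged
    return {frozenset(c) for c in comp.values()}

h, t = F(1, 2), F(1, 3)
# Hypothetical covariance matrix: tree components {0, 1, 2} and {3, 4}.
SIGMA = [[F(x) for x in row] for row in
         [[1, h, 0, 0, 0],
          [h, 1, h, 0, 0],
          [0, h, 1, 0, 0],
          [0, 0, 0, 1, t],
          [0, 0, 0, t, 1]]]
K = inverse(SIGMA)

cov_edges, conc_edges = graph(SIGMA), graph(K)
print(components(5, cov_edges) == components(5, conc_edges))  # part (i)
print(sorted(conc_edges))                                     # part (ii)
```

The two graphs share the components {0, 1, 2} and {3, 4}, and within each component the concentration graph contains every possible edge, including (0, 2), which is absent from the covariance graph.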



Remark. We further note that Theorem 4 is also directly useful in deducing the completeness
of the concentration graph from the covariance graph in other settings. As a concrete
example consider the case when $G_0$ is a cycle with an even number of edges, i.e.,
$|V| = 2k$ for some integer k, and assume that all the nonzero off-diagonal coefficients in
the covariance matrix $\Sigma$ of $X_V$ are positive. A given pair of vertices (u, v) in
$G_0$ is connected by exactly two paths, $p_1$ and $p_2$, whose lengths sum to $2k$ and
therefore have the same parity. Suppose first that both are of odd length. Using Theorem 4,
it is easily deduced that

$$k_{uv} = \sigma_{p_1} \frac{|\Sigma \setminus p_1|}{|\Sigma|} + \sigma_{p_2} \frac{|\Sigma \setminus p_2|}{|\Sigma|},$$

where $\sigma_{p_i}$ denotes the product of the covariance coefficients along the edges of
$p_i$. Here $\sigma_{p_1}$ and $\sigma_{p_2}$ are strictly positive as they are both equal
to a product of positive coefficients, and the determinants are positive as principal minors
of a positive definite matrix. Hence $k_{uv} \neq 0$. The same argument can also be used in
the case when $p_1$ and $p_2$ both have even length, where the two terms again share a
common sign, to deduce that $k_{uv} \neq 0$. Hence u and v are adjacent in the concentration
graph $G$; thus $G$ is necessarily complete.

We now give an example illustrating the main result in this paper (Theorem 3). 

Example 1 Consider a Gaussian random vector $X = (X_1, \ldots, X_8)'$ with covariance matrix
$\Sigma$ and its associated covariance graph $G_0$ as given in Figure 1. Consider the sets
A = {1, 2}, B = {5} and S = {4, 6}. Note that S does not separate A and B in $G_0$, as the
paths from A to B do not intersect S. In this case we cannot use the covariance global
Markov property to claim that $X_A$ is not independent of $X_B$ given
$X_{V \setminus (A \cup B \cup S)}$. This is because the covariance global Markov property
allows us to read conditional independences present in a distribution if a separation is
present in the graph. It is not an "if and only if" property, in the sense that the lack of
a separation in the graph does not necessarily imply the lack of the corresponding
conditional independence. We shall show, however, that in this example $X_A$ is indeed not
independent of $X_B$ given $X_{V \setminus (A \cup B \cup S)}$. In other words, we shall
show that the graph has the ability to capture this conditional dependence present in the
probability distribution P.

Figure 1: An 8-vertex covariance tree $G_0$.

Let us now examine the relationship between $X_2$ and $X_5$ given $X_{\{3, 7, 8\}}$. Note
that in this example $V \setminus (A \cup B \cup S) = \{3, 8, 7\}$, $2 \in A$ and
$5 \in B$. Note that the covariance graph associated with the probability distribution of
the random vector $(X_2, X_5, X_{\{3, 8, 7\}})'$ is the subgraph represented in Figure 2 and
can be obtained directly as the subgraph of $G_0$ induced by the subset {2, 5, 3, 7, 8}.

Figure 2: The covariance graph $(G_0)_{\{2, 5, 3, 8, 7\}}$.



Since 2 and 5 are connected by exactly one path in $(G_0)_{\{2, 5, 3, 7, 8\}}$, that is
p = (2, 3, 5), the coefficient $k_{25|387}$, i.e., the coefficient between 2 and 5 in the
inverse of the covariance matrix of $(X_2, X_5, X_{\{3, 8, 7\}})'$, can be computed using
Theorem 4 as follows:

$$k_{25|387} = (-1)^{2+1} \, \sigma_{23} \sigma_{35} \, \frac{|\Sigma(\{8, 7\})|}{|\Sigma(\{2, 5, 3, 8, 7\})|}, \qquad (12)$$

where $\Sigma(\{7, 8\})$ and $\Sigma(\{2, 5, 3, 8, 7\})$ are respectively the covariance
matrices of the Gaussian random vectors $(X_7, X_8)'$ and $(X_2, X_5, X_{\{3, 8, 7\}})'$.
Hence $k_{25|387} \neq 0$, since the right hand side of the equation in (12) is different
from zero. Hence $X_2 \not\perp\!\!\!\perp X_5 \mid X_{\{3, 8, 7\}}$.

Now recall that for any Gaussian random vector $X_V = (X_u, u \in V)'$,

$$X_A \perp\!\!\!\perp X_B \mid X_C \ \text{ if and only if } \ \forall (u, v) \in A \times B, \ X_u \perp\!\!\!\perp X_v \mid X_C, \qquad (13)$$

where A, B and C are pairwise disjoint subsets of V. The contrapositive of (13) yields

$$X_2 \not\perp\!\!\!\perp X_5 \mid X_{\{3, 7, 8\}} \implies X_{\{1, 2\}} \not\perp\!\!\!\perp X_5 \mid X_{\{3, 7, 8\}}.$$

Hence we conclude that, since S = {4, 6} does not separate {1, 2} and {5} in $G_0$,
$X_{\{1, 2\}}$ is not independent of $X_5$ given $X_{\{3, 7, 8\}}$, i.e.,

$$\{1, 2\} \not\perp_{G_0} \{5\} \mid \{4, 6\} \implies X_{\{1, 2\}} \not\perp\!\!\!\perp X_5 \mid X_{\{3, 7, 8\}}.$$
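The computation in Example 1 can be reproduced numerically once a concrete covariance matrix is fixed. The sketch below is only an illustration under stated assumptions: the text implies the edges 1-2, 2-3, 3-5, 5-7 and 7-8, while the attachment of vertices 4 and 6 (here 3-4 and 5-6), the unit variances and the common edge covariance `RHO` are hypothetical choices. Because the sign convention of the exponent does not matter for the argument, only absolute values are compared with the right hand side of (12).

```python
from fractions import Fraction as F

def det(m):
    """Exact determinant by Gaussian elimination (empty matrix -> 1)."""
    a = [list(row) for row in m]
    d = F(1)
    for c in range(len(a)):
        piv = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, len(a)):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Exact inverse via the adjugate: K[i][j] = cofactor(j, i) / det."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j)
             * det([[m[r][c] for c in range(n) if c != i]
                    for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

# Assumed tree: edges 3-4 and 5-6 are guesses, the others follow the text.
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (5, 7), (7, 8)]
RHO = F(3, 10)  # assumed covariance on every edge; diagonally dominant, so PD

def sigma(u, v):
    """Entry of the hypothetical covariance matrix."""
    if u == v:
        return F(1)
    return RHO if (u, v) in EDGES or (v, u) in EDGES else F(0)

def submatrix(idx):
    """Covariance matrix of the sub-vector indexed by `idx` (in that order)."""
    return [[sigma(u, v) for v in idx] for u in idx]

W = [2, 5, 3, 8, 7]
k_25 = inverse(submatrix(W))[0][1]          # coefficient between 2 and 5
rhs = sigma(2, 3) * sigma(3, 5) * det(submatrix([8, 7])) / det(submatrix(W))
print(k_25 != 0, abs(k_25) == abs(rhs))

# Block between A = {1, 2} and B = {5} given {3, 8, 7}: not identically zero.
K2 = inverse(submatrix([1, 2, 5, 3, 8, 7]))
block = [K2[0][2], K2[1][2]]
print(any(x != 0 for x in block))
```

The nonzero entry `k_25` corresponds to $X_2 \not\perp\!\!\!\perp X_5 \mid X_{\{3, 8, 7\}}$, and the nonzero block corresponds to the dependence of $X_{\{1, 2\}}$ and $X_5$ given $X_{\{3, 7, 8\}}$.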

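We now proceed to the proof of Theorem 3.

Proof of Theorem 3. Without loss of generality we assume that $G_0$ is a connected tree.
Since Gaussian distributions satisfy the covariance global Markov property, a separation in
$G_0$ always implies the corresponding conditional independence, so only the converse can
fail. Let us assume to the contrary that P is not covariance faithful to $G_0$. Then there
exists a triplet (A, B, S) of pairwise disjoint subsets of V such that
$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S)}$ but S does not separate A
and B in $G_0$, i.e.,

$$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S)} \ \text{ and } \ A \not\perp_{G_0} B \mid S.$$

As S does not separate A and B and since $G_0$ is a connected tree, there exists a pair of
vertices $(u, v) \in A \times B$ such that the single path p connecting u and v in $G_0$
does not intersect S, i.e., $S \cap p = \emptyset$. Hence
$p \subseteq V \setminus S = (A \cup B) \cup (V \setminus (A \cup B \cup S))$. Thus two
cases are possible with regard to where the path p can lie: either
$p \subseteq A \cup B$ or $p \cap (V \setminus (A \cup B \cup S)) \neq \emptyset$. Let us
examine both cases separately.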

• Case 1: $p \subseteq A \cup B$.

In this case the entire path between u and v lies in $A \cup B$ and hence we can find a pair
of vertices² (u', v') belonging to p with $(u', v') \in A \times B$ such that
$u' \sim_{G_0} v'$.

Recall that since $G_0$ is a tree, any graph induced from $G_0$ by a subset of V is a union
of tree connected components (see Lemma 1). Hence the subgraph $(G_0)_W$ of $G_0$ induced by
$W = \{u', v'\} \cup (V \setminus (A \cup B \cup S))$ is a union of tree connected
components. As u' and v' are adjacent in $G_0$, they are also adjacent in $(G_0)_W$ and
belong to the same connected component³ of $(G_0)_W$. Hence the only path between u' and v'
is precisely the edge (u', v'). Using Theorem 4 to compute the coefficient
$k_{u'v' | V \setminus (A \cup B \cup S)}$, i.e., the (u', v')th coefficient in the inverse
of the covariance matrix of the random vector
$X_W = (X_w, w \in W)' = (X_{u'}, X_{v'}, X_{V \setminus (A \cup B \cup S)})'$, we obtain

$$k_{u'v' | V \setminus (A \cup B \cup S)} = (-1)^{1+1} \, \sigma_{u'v'} \, \frac{|\Sigma(W \setminus \{u', v'\})|}{|\Sigma(W)|}, \qquad (14)$$

where $\Sigma(W)$ denotes the covariance matrix of $X_W$, and
$\Sigma(W \setminus \{u', v'\})$ denotes the matrix $\Sigma(W)$ with the rows and the
columns corresponding to variables $X_{u'}$ and $X_{v'}$ omitted. We can therefore deduce
from (14) that $k_{u'v' | V \setminus (A \cup B \cup S)} \neq 0$. Recall that at the start
of the proof we assumed to the contrary that
$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup S)}$. Now since P is Gaussian,
for pairwise disjoint subsets A, B and $V \setminus (A \cup B \cup C)$,

$$X_A \perp\!\!\!\perp X_B \mid X_{V \setminus (A \cup B \cup C)} \iff \forall (u, v) \in A \times B, \ X_u \perp\!\!\!\perp X_v \mid X_{V \setminus (A \cup B \cup C)}. \qquad (15)$$

Note however that we have established that
$X_{u'} \not\perp\!\!\!\perp X_{v'} \mid X_{V \setminus (A \cup B \cup S)}$, since
$k_{u'v' | V \setminus (A \cup B \cup S)} \neq 0$. Hence we obtain a contradiction to (15)
since $u' \in A$ and $v' \in B$.

² As an illustration of this point consider the graph presented in Figure 1. Let
A = {1, 2}, B = {3, 5} and S = {4, 6}. We note that the path p = (1, 2, 3, 5) lies entirely
in $A \cup B$ and hence we can find two vertices, namely $2 \in A$ and $3 \in B$, belonging
to path p that are adjacent in $G_0$.

³ In our example in Figure 1 with W = {2, 3, 8, 7}, $(G_0)_W$ consists of a union of two
connected components with respective vertex sets {2, 3} and {8, 7}.

• Case 2: $p \cap (V \setminus (A \cup B \cup S)) \neq \emptyset$ and
$V \setminus (A \cup B \cup S)$ is not empty.

Now if $V \setminus (A \cup B \cup S)$ is empty then p has to lie entirely in $A \cup B$.
This is because by assumption p does not intersect S. The case when p lies in $A \cup B$ is
covered in Case 1 and hence it is assumed that
$V \setminus (A \cup B \cup S) \neq \emptyset$.⁴

In this case there exists a pair of vertices $(u', v') \in A \times B$ with
$u', v' \in p$, such that the vertices u' and v' are connected by exactly one path
$p' \subseteq p$ in the graph $(G_0)_W$ induced from $G_0$ by
$W = \{u', v'\} \cup (V \setminus (A \cup B \cup S))$ (see Lemma 1).⁵

Let us now use Theorem 4 to compute the coefficient
$k_{u'v' | V \setminus (A \cup B \cup S)}$, i.e., the (u', v')th coefficient in the inverse
of the covariance matrix of the random vector
$X_W = (X_w, w \in W)' = (X_{u'}, X_{v'}, X_{V \setminus (A \cup B \cup S)})'$. Writing
$p' = (w_0 = u', w_1, \ldots, w_m = v')$, we obtain

$$k_{u'v' | V \setminus (A \cup B \cup S)} = (-1)^{|p'|+1} \, \sigma_{w_0 w_1} \cdots \sigma_{w_{m-1} w_m} \, \frac{|\Sigma(W \setminus p')|}{|\Sigma(W)|}, \qquad (16)$$

where $\Sigma(W)$ denotes the covariance matrix of $X_W$ and $\Sigma(W \setminus p')$
denotes $\Sigma(W)$ with the rows and the columns corresponding to variables in path p'
omitted. One can therefore easily deduce from (16) that
$k_{u'v' | V \setminus (A \cup B \cup S)} \neq 0$. Thus $X_{u'}$ is not independent of
$X_{v'}$ given $X_{V \setminus (A \cup B \cup S)}$. Hence once more we obtain a
contradiction to (15) since $u' \in A$ and $v' \in B$.
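For very small graphs the statement of Theorem 3 can also be checked by brute force, using the reformulation in Definition 6: for every triplet (A, B, S) of pairwise disjoint subsets with A and B nonempty, S separates A and B in $G_0$ exactly when the (A, B) block of the inverse covariance matrix of $X_{A \cup B \cup R}$ vanishes, where $R = V \setminus (A \cup B \cup S)$. The sketch below is not a substitute for the proof. The matrices `PATH_TREE`, `STAR_TREE` and `CYCLE` are hypothetical, and the 4-cycle with one negative covariance is added only as an assumed contrast case in which two path contributions cancel.

```python
from fractions import Fraction as F
from itertools import product

def det(m):
    """Exact determinant by Gaussian elimination (empty matrix -> 1)."""
    a = [list(row) for row in m]
    d = F(1)
    for c in range(len(a)):
        piv = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if piv is None:
            return F(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, len(a)):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Exact inverse via the adjugate: K[i][j] = cofactor(j, i) / det."""
    n, d = len(m), det(m)
    return [[(-1) ** (i + j)
             * det([[m[r][c] for c in range(n) if c != i]
                    for r in range(n) if r != j]) / d
             for j in range(n)] for i in range(n)]

def separates(sigma, A, B, S):
    """True if S meets every path from A to B in the covariance graph."""
    reached, stack = set(A), list(A)
    while stack:
        x = stack.pop()
        for y in range(len(sigma)):
            if y != x and sigma[x][y] != 0 and y not in S and y not in reached:
                reached.add(y)
                stack.append(y)
    return not (reached & set(B))

def cond_indep(sigma, A, B, C):
    """Gaussian X_A independent of X_B given X_C: zero (A, B) block."""
    idx = list(A) + list(B) + list(C)
    K = inverse([[sigma[u][v] for v in idx] for u in idx])
    return all(K[i][len(A) + j] == 0
               for i in range(len(A)) for j in range(len(B)))

def covariance_faithful(sigma):
    """Check Definition 6 over all labelings of V into A, B, S and the rest R."""
    n = len(sigma)
    for labels in product("ABSR", repeat=n):
        A, B, S, R = ([v for v in range(n) if labels[v] == tag] for tag in "ABSR")
        if A and B and separates(sigma, A, B, S) != cond_indep(sigma, A, B, R):
            return False
    return True

def covariance(n, weights):
    """Unit-variance covariance matrix with the given off-diagonal entries."""
    m = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    for (u, v), w in weights.items():
        m[u][v] = m[v][u] = w
    return m

q = F(1, 4)
PATH_TREE = covariance(4, {(0, 1): q, (1, 2): q, (2, 3): q})
STAR_TREE = covariance(4, {(0, 1): q, (0, 2): q, (0, 3): q})
CYCLE = covariance(4, {(0, 1): q, (1, 2): q, (2, 3): q, (0, 3): -q})

print(covariance_faithful(PATH_TREE), covariance_faithful(STAR_TREE))
print(covariance_faithful(CYCLE))
```

In the cycle example the two paths between vertices 0 and 2 contribute terms of opposite sign that cancel, so $X_0$ and $X_2$ are conditionally independent given the other two variables even though the empty set does not separate them. No such cancellation is possible in a tree, where each pair of vertices is joined by a single path.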



⁴ As an illustration of this point consider once more the graph presented in Figure 1.
Consider A = {1, 2}, B = {7, 8} and S = {4, 6}. Here
$V \setminus (A \cup B \cup S) = \{3, 5\}$ and the path p = (1, 2, 3, 5, 7, 8) connecting A
and B intersects $V \setminus (A \cup B \cup S)$.

⁵ In our example in Figure 1 with A = {1, 2}, B = {7, 8} and S = {4, 6}, the vertices u' and
v' will correspond to vertices 2 and 7 respectively, and p' = (2, 3, 5, 7), which is a path
entirely contained in $(V \setminus (A \cup B \cup S)) \cup \{u', v'\}$.

Remark. The dual result of the theorem above for the case of concentration trees was proved
by Becker et al. (2005). We note however that the argument used in the proof of Theorem 3
cannot be used to prove faithfulness of Gaussian distributions that have trees as
concentration graphs. The reason for this is as follows. In our proof we employed the fact
that the sub-graph $(G_0)_{\{u, v\} \cup S}$ of $G_0$ induced by a subset
$\{u, v\} \cup S \subseteq V$ is also the covariance graph associated with the Gaussian
sub-random vector of $X_V$ denoted by
$X_{\{u, v\} \cup S} = (X_w, w \in \{u, v\} \cup S)'$. Hence it was possible to compute the
coefficient $k_{uv|S}$, which quantifies the conditional (in)dependence between u and v
given S, in terms of the paths in $(G_0)_{\{u, v\} \cup S}$ and the coefficients of the
covariance matrix of $X_{\{u, v\} \cup S} = (X_w, w \in \{u, v\} \cup S)'$. On the contrary,
in the case of concentration graphs the sub-graph $G_{\{u, v\} \cup S}$ of the concentration
graph $G$ induced by $\{u, v\} \cup S$ is not in general the concentration graph of the
random vector $X_{\{u, v\} \cup S} = (X_w, w \in \{u, v\} \cup S)'$. Hence our approach is
not directly applicable in the concentration graph setting.



5 Conclusion 

Faithfulness of a probability distribution to a graph is a crucial assumption that is often
made in the probabilistic treatment of graphical models. This assumption describes the
ability of a graph to reflect or encode the multivariate dependencies that are present in a
joint probability distribution. Much of the methodology in this area does not undertake a
detailed analysis of the faithfulness assumption, as such an endeavor requires a more
careful and rigorous probabilistic study of the joint distribution at hand. In this note we
looked at the class of multivariate Gaussian distributions that are Markov with respect to
covariance graphs and proved that Gaussian distributions which have trees as their
covariance graphs are necessarily faithful. The method of proof employed in this paper is
novel in the sense that it is self-contained and yields a completely new approach to
demonstrating faithfulness, as compared to the methods that are traditionally used in the
literature. Moreover, it is also vastly different in nature from the proof of the analogous
result for concentration graph models. Hence the approach used in this paper promises to
have further implications and give other insights. Future research in this area will explore
whether the techniques used in this paper can be modified to prove or disprove faithfulness
for other classes of graphs.

Acknowledgments 

The authors gratefully acknowledge the faculty at Stanford University for their feedback and 
tremendous enthusiasm for this work. 

References 

Banerjee, M., & Richardson, T. 2003. On a Dualization of Graphical Gaussian Models: A
Correction Note. Scand. J. Statist., 30, 817-820.

Becker, A., Geiger, D., & Meek, C. 2005. Perfect Tree-like Markovian Distributions.
Probability and Mathematical Statistics, 25(2), 231-239.

Cox, D. R., & Wermuth, N. 1996. Multivariate Dependencies: Models, Analysis and
Interpretation. Chapman and Hall.

Cox, D. R., & Wermuth, N. 1993. Linear dependencies represented by chain graphs (with
Discussion). Statist. Sci., 8, 204-218, 247-277.






Hammersley, J. M., & Clifford, P. E. 1971. Markov fields on finite graphs and lattices.
Unpublished manuscript.

Ji, C., & Seymour, L. 1996. A consistent model selection procedure for Markov random fields
based on penalized pseudolikelihood. The Annals of Applied Probability, 6(2), 423-443.

Jones, B., & West, M. 2005. Covariance decomposition in undirected Gaussian graphical
models. Biometrika, 92, 770-786.

Kauermann, G. 1996. On a dualization of graphical Gaussian models. Scand. J. Statist., 23,
105-116.

Khare, K., & Rajaratnam, B. 2009. Wishart distributions for decomposable covariance graph
models. Under review in the Annals of Statistics.

Kindermann, R., & Snell, J. L. 1980. Markov Random Fields and Their Applications. American
Mathematical Society, Providence, Rhode Island.

Kunsch, H., Geman, S., & Kehagias, A. 1995. Hidden Markov Random Fields. The Annals of
Applied Probability, 5(3), 577-602.

Lauritzen, S. L. 1996. Graphical Models. New York: Oxford University Press.

Malouche, D., & Rajaratnam, B. 2009. Analysis of the faithfulness assumption in Graphical
Models. Technical Report, Department of Statistics, Stanford University.

Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann.

Spitzer, F. 1975. Markov random fields on an infinite tree. The Annals of Probability, 3,
387-398.






